Exploiting Properties of Legislative Texts to Improve Classification Accuracy
Abstract
Organizing legislative texts into a hierarchy of legal topics improves access to legislation. Manually placing every part of a new legislative text at the correct position in the hierarchy, however, is expensive and slow, and therefore calls for automation. In this paper, we assess the ability of machine learning methods to build a model that automatically classifies legislative texts in a legal topic hierarchy. We investigate whether such methods can generalize across different codes. The classification process exploits the specific properties of legislative documents: both the hierarchical structure of legal codes and the references within the legal document collection are taken into account. We argue for closer cooperation between legal and machine learning experts as the main direction of future work.
Related resources
Exploiting Associations between Class Labels in Multi-label Classification
Multi-label classification has many applications in text categorization, biology, and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. Because the labels are often related to one another, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...
A Method to Improve the Accuracy of Remote Sensing Data Classification by Exploiting the Multi-Scale Properties in the Scene
Land use mapping is one of the major applications of remote sensing. While most studies focus on advanced thematic classification algorithms for land use mapping, the scale factor in remote sensing data classification has received less attention. Previous studies showed that while multi-scale characteristics exist in remotely sensed data for land use classification, some clas...
An Improvement in Support Vector Machines Algorithm with Imperialism Competitive Algorithm for Text Documents Classification
Due to the exponential growth of electronic texts, their organization and management require a tool that gives users the information they are searching for in the shortest possible time. Thus, classification methods have become very important in recent years. In natural language processing, and especially in text processing, one of the most basic tasks is automatic text classification. Moreover, text ...
Constructing and exploiting an automatically annotated resource of legislative texts
In this paper, we report on the construction of a resource of Swiss legislative texts that is automatically annotated with structural, morphosyntactic and content-related information, and we discuss the exploitation of this resource for the purposes of legislative drafting, legal linguistics and translation and for the evaluation of legislation. Our resource is based on the classified compilati...
A Method for Keyword Extraction and Word Weighting to Improve the Classification of Persian Texts
Due to the ever-increasing expansion of information and the huge number of unstructured documents, keywords play a very important role in information retrieval. Because manual keyword extraction faces various challenges, automated extraction seems inevitable. This research uses a thesaurus (a structured word-net) to extract keywords automatically. A...